10. Lab: Decoder

FCN: Decoder

Earlier, we covered transposed convolutions as one way of upsampling layers to higher dimensions or resolutions. There are, in fact, several ways to achieve upsampling. In this section, we will cover another method called bilinear upsampling or bilinear interpolation.

Bilinear Upsampling

Bilinear upsampling is a resampling technique that utilizes the weighted average of four nearest known pixels, located diagonally to a given pixel, to estimate a new pixel intensity value. The weighted average is usually distance dependent.

Let's consider the scenario where you have 4 known pixel values, so essentially a 2x2 grayscale image. This image is required to be upsampled to a 4x4 image. The following image gives a better idea of this process.

The unmarked pixels shown in the 4x4 illustration above are essentially whitespace. The bilinear upsampling method will try to fill out all the remaining pixel values via interpolation. Consider the case of P5 to understand this algorithm.

We initially calculate the pixel values at P12 and P34 using linear interpolation.

That gives us,

P12 = P1 + W1*(P2 - P1)/W2

and

P34 = P3 + W1*(P4 - P3)/W2

Using P12 and P34, we can obtain P5:

P5 = P12 + H1*(P34 - P12)/H2

For simplicity's sake, we assume that H2 = W2 = 1

After substituting for P34 and P12 the final equation for the pixel value of P5 is:

P5 = P1*(1 - W1)*(1 - H1) + P2*W1*(1 - H1) + P3*H1*(1 - W1) + P4*W1*H1

While the math becomes more complex, the above technique can be extended to RGB images as well.

Given the equation above, what would be the value for P5 if P1 = 150 , P2 = 200 , P3 = 65, P4 = 100, W1 = 0.5 , H1 = 0.7?

SOLUTION: 110.25

Coding Bilinear Upsampler

An optimized version of a bilinear upsampler has been provided for you in the utils module of the provided repo and can be implemented as follows:

output = BilinearUpSampling2D(row, col)(input)

Where,

input is the input layer,

row is the upsampling factor for the rows of the output layer,

col is the upsampling factor for the columns of the output layer, and

output is the output layer.

The following is an example of how you can use the above function:

output = BilinearUpSampling2D((2,2))(input)

The bilinear upsampling method does not contribute as a learnable layer like the transposed convolutions in the architecture and is prone to lose some finer details, but it helps speed up performance.